Overview

Dataset statistics

Number of variables: 22
Number of observations: 25652
Missing cells: 64490
Missing cells (%): 11.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.3 MiB
Average record size in memory: 176.0 B
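
The missing-cell percentage follows from the other figures in these statistics. As a quick check, the sketch below recomputes it and confirms that the per-variable missing counts reported elsewhere in this report add up to the total. The variable names in the code are illustrative only.

```python
# Figures copied from the dataset statistics.
n_variables = 22
n_observations = 25652
missing_cells = 64490

# Per-variable missing counts as reported in the Warnings list
# and in the rooms_number summary.
missing_by_variable = {
    "rooms_number": 116,
    "area": 1786,
    "kitchen_has": 2187,
    "furnished": 2832,
    "open_fire": 2658,
    "terrace": 6534,
    "terrace_area": 11427,
    "garden": 3685,
    "garden_area": 10074,
    "land_surface": 6269,
    "land_plot_surface": 8371,
    "facades_number": 5580,
    "swimming_pool_has": 2971,
}

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(total_cells)
print(f"{missing_pct:.1f}%")
print(sum(missing_by_variable.values()) == missing_cells)
```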

Variable types

NUM: 11
CAT: 6
BOOL: 5

Warnings

locality has a high cardinality: 867 distinct values High cardinality
property_subtype has a high cardinality: 178 distinct values High cardinality
Unnamed: 0 is highly correlated with df_index High correlation
df_index is highly correlated with Unnamed: 0 High correlation
area has 1786 (7.0%) missing values Missing
kitchen_has has 2187 (8.5%) missing values Missing
furnished has 2832 (11.0%) missing values Missing
open_fire has 2658 (10.4%) missing values Missing
terrace has 6534 (25.5%) missing values Missing
terrace_area has 11427 (44.5%) missing values Missing
garden has 3685 (14.4%) missing values Missing
garden_area has 10074 (39.3%) missing values Missing
land_surface has 6269 (24.4%) missing values Missing
land_plot_surface has 8371 (32.6%) missing values Missing
facades_number has 5580 (21.8%) missing values Missing
swimming_pool_has has 2971 (11.6%) missing values Missing
rooms_number is highly skewed (γ1 = 26.40591264) Skewed
area is highly skewed (γ1 = 67.92246901) Skewed
terrace_area is highly skewed (γ1 = 50.34765983) Skewed
garden is highly skewed (γ1 = 95.45319567) Skewed
garden_area is highly skewed (γ1 = 26.60671719) Skewed
land_surface is highly skewed (γ1 = 113.7686367) Skewed
land_plot_surface is highly skewed (γ1 = 38.43006696) Skewed
df_index has unique values Unique
Unnamed: 0 has unique values Unique
rooms_number has 601 (2.3%) zeros Zeros
area has 983 (3.8%) zeros Zeros
terrace_area has 6711 (26.2%) zeros Zeros
garden has 15743 (61.4%) zeros Zeros
garden_area has 11761 (45.8%) zeros Zeros
land_surface has 11203 (43.7%) zeros Zeros
land_plot_surface has 911 (3.6%) zeros Zeros
facades_number has 8610 (33.6%) zeros Zeros
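
The γ1 values in the Skewed warnings above are sample skewness coefficients. The sketch below shows one way such a value can be computed. It assumes the bias-adjusted estimator that pandas `Series.skew` implements, which this report does not confirm. `sample_skewness` is an illustrative name, not part of any library.

```python
from math import sqrt


def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness G1 (assumed estimator, needs n >= 3)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three observations")
    mean = sum(values) / n
    # Second and third central moments (population form, divided by n).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        return 0.0  # constant data: no asymmetry to measure
    g1 = m3 / m2 ** 1.5
    # Small-sample adjustment that turns g1 into G1.
    return sqrt(n * (n - 1)) / (n - 2) * g1


print(sample_skewness([1, 2, 3, 4, 5]))   # symmetric sample
print(sample_skewness([1, 2, 3, 4, 10]))  # one large value on the right
```

A handful of extreme values on the right, such as the largest area or land_surface entries, is enough to push the coefficient far above zero.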

Reproduction

Analysis started: 2020-11-19 10:34:05.092720
Analysis finished: 2020-11-19 10:34:49.601441
Duration: 44.51 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 25652
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25616.43735
Minimum: 0
Maximum: 51302
Zeros: 1
Zeros (%): < 0.1%
Memory size: 200.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2580.55
Q1: 12796.5
median: 25497.5
Q3: 38510.25
95-th percentile: 48763.9
Maximum: 51302
Range: 51302
Interquartile range (IQR): 25713.75

Descriptive statistics

Standard deviation: 14827.02187
Coefficient of variation (CV): 0.5788088979
Kurtosis: -1.202410756
Mean: 25616.43735
Median Absolute Deviation (MAD): 12861
Skewness: 0.006107397401
Sum: 657112851
Variance: 219840577.6
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values

Value                 Count   Frequency (%)
10235                 1       < 0.1%
13644                 1       < 0.1%
46396                 1       < 0.1%
48445                 1       < 0.1%
42302                 1       < 0.1%
44351                 1       < 0.1%
21824                 1       < 0.1%
17730                 1       < 0.1%
30020                 1       < 0.1%
27975                 1       < 0.1%
Other values (25642)  25642   > 99.9%

Smallest values

Value   Count   Frequency (%)
0       1       < 0.1%
1       1       < 0.1%
2       1       < 0.1%
3       1       < 0.1%
4       1       < 0.1%

Largest values

Value   Count   Frequency (%)
51302   1       < 0.1%
51301   1       < 0.1%
51300   1       < 0.1%
51299   1       < 0.1%
51298   1       < 0.1%

Unnamed: 0
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 25652
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25616.43735
Minimum: 0
Maximum: 51302
Zeros: 1
Zeros (%): < 0.1%
Memory size: 200.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2580.55
Q1: 12796.5
median: 25497.5
Q3: 38510.25
95-th percentile: 48763.9
Maximum: 51302
Range: 51302
Interquartile range (IQR): 25713.75

Descriptive statistics

Standard deviation: 14827.02187
Coefficient of variation (CV): 0.5788088979
Kurtosis: -1.202410756
Mean: 25616.43735
Median Absolute Deviation (MAD): 12861
Skewness: 0.006107397401
Sum: 657112851
Variance: 219840577.6
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values

Value                 Count   Frequency (%)
10235                 1       < 0.1%
13644                 1       < 0.1%
46396                 1       < 0.1%
48445                 1       < 0.1%
42302                 1       < 0.1%
44351                 1       < 0.1%
21824                 1       < 0.1%
17730                 1       < 0.1%
30020                 1       < 0.1%
27975                 1       < 0.1%
Other values (25642)  25642   > 99.9%

Smallest values

Value   Count   Frequency (%)
0       1       < 0.1%
1       1       < 0.1%
2       1       < 0.1%
3       1       < 0.1%
4       1       < 0.1%

Largest values

Value   Count   Frequency (%)
51302   1       < 0.1%
51301   1       < 0.1%
51300   1       < 0.1%
51299   1       < 0.1%
51298   1       < 0.1%

locality
Categorical

HIGH CARDINALITY

Distinct: 867
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Memory size: 200.4 KiB

Value               Count   Frequency (%)
unknown             11707   45.6%
8300                562     2.2%
1180                471     1.8%
1000                384     1.5%
1050                352     1.4%
9000                290     1.1%
8400                220     0.9%
4000                164     0.6%
1200                159     0.6%
1070                146     0.6%
Other values (857)  11197   43.6%

Frequencies of value counts

Unique

Unique: 117
Unique (%): 0.5%

Histogram of lengths of the category

Length

Max length: 7
Median length: 4
Mean length: 5.369133011
Min length: 4

house_is
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 200.4 KiB

Value     Count   Frequency (%)
True      12339   48.1%
False     11007   42.9%
unknown   2306    9.0%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 7
Median length: 5
Mean length: 4.698775924
Min length: 4

property_subtype
Categorical

HIGH CARDINALITY

Distinct: 178
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 200.4 KiB

Value                Count   Frequency (%)
HOUSE                7678    29.9%
APARTMENT            5000    19.5%
house                1787    7.0%
apartment            1766    6.9%
VILLA                1607    6.3%
APARTMENT_BLOCK      921     3.6%
MIXED_USE_BUILDING   865     3.4%
Apartment            639     2.5%
PENTHOUSE            454     1.8%
DUPLEX               439     1.7%
Other values (168)   4496    17.5%

Frequencies of value counts

Unique

Unique: 35
Unique (%): 0.1%

Histogram of lengths of the category

Length

Max length: 35
Median length: 6
Mean length: 7.842585373
Min length: 1

price
Real number (ℝ≥0)

Distinct: 2072
Distinct (%): 8.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 432368.0558
Minimum: 0
Maximum: 15000000
Zeros: 33
Zeros (%): 0.1%
Memory size: 200.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 4
Q1: 190000
median: 294983
Q3: 460000
95-th percentile: 1300000
Maximum: 15000000
Range: 15000000
Interquartile range (IQR): 270000

Descriptive statistics

Standard deviation: 556720.9564
Coefficient of variation (CV): 1.287608899
Kurtosis: 82.50475865
Mean: 432368.0558
Median Absolute Deviation (MAD): 124017
Skewness: 6.222075963
Sum: 1.109110537e+10
Variance: 3.099382233e+11
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values

Value                 Count   Frequency (%)
1                     772     3.0%
295000                305     1.2%
2                     298     1.2%
275000                294     1.1%
249000                289     1.1%
199000                283     1.1%
395000                272     1.1%
225000                264     1.0%
299000                245     1.0%
349000                227     0.9%
Other values (2062)   22403   87.3%

Smallest values

Value   Count   Frequency (%)
0       33      0.1%
1       772     3.0%
2       298     1.2%
3       105     0.4%
4       81      0.3%

Largest values

Value      Count   Frequency (%)
15000000   3       < 0.1%
9500000    1       < 0.1%
8750000    1       < 0.1%
6732500    2       < 0.1%
6700000    2       < 0.1%

sale
Categorical

Distinct: 17
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 200.4 KiB

Value                              Count   Frequency (%)
Unknown                            13883   54.1%
residential_sale                   4750    18.5%
unknown                            4726    18.4%
Apartment                          1467    5.7%
first_session_with_reserve_price   297     1.2%
Wohnung                            142     0.6%
Public Sale                        89      0.3%
Huis                               67      0.3%
House                              63      0.2%
Maison                             53      0.2%
Other values (7)                   115     0.4%

Frequencies of value counts

Unique

Unique: 1
Unique (%): < 0.1%

Histogram of lengths of the category

Length

Max length: 32
Median length: 7
Mean length: 9.10030407
Min length: 4

rooms_number
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 45
Distinct (%): 0.2%
Missing: 116
Missing (%): 0.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.421209273
Minimum: 0
Maximum: 204
Zeros: 601
Zeros (%): 2.3%
Memory size: 200.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
median: 3
Q3: 4
95-th percentile: 7
Maximum: 204
Range: 204
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 3.456006047
Coefficient of variation (CV): 1.010170899
Kurtosis: 1231.973886
Mean: 3.421209273
Median Absolute Deviation (MAD): 1
Skewness: 26.40591264
Sum: 87364
Variance: 11.9439778
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=45)

Most frequent values

Value               Count   Frequency (%)
3                   7782    30.3%
2                   6362    24.8%
4                   4042    15.8%
5                   2082    8.1%
1                   1930    7.5%
6                   1190    4.6%
0                   601     2.3%
7                   578     2.3%
8                   331     1.3%
9                   222     0.9%
Other values (35)   416     1.6%

Smallest values

Value   Count   Frequency (%)
0       601     2.3%
1       1930    7.5%
2       6362    24.8%
3       7782    30.3%
4       4042    15.8%

Largest values

Value   Count   Frequency (%)
204     2       < 0.1%
165     1       < 0.1%
100     3       < 0.1%
99      1       < 0.1%
90      2       < 0.1%

area
Real number (ℝ≥0)

MISSING
SKEWED
ZEROS

Distinct: 1664
Distinct (%): 7.0%
Missing: 1786
Missing (%): 7.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 19345.44472
Minimum: 0
Maximum: 73500000
Zeros: 983
Zeros (%): 3.8%
Memory size: 200.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 32
Q1: 99
median: 154
Q3: 254
95-th percentile: 1948
Maximum: 73500000
Range: 73500000
Interquartile range (IQR): 155

Descriptive statistics

Standard deviation: 809853.4989
Coefficient of variation (CV): 41.86274912
Kurtosis: 5094.106
Mean: 19345.44472
Median Absolute Deviation (MAD): 66
Skewness: 67.92246901
Sum: 461698383.7
Variance: 6.558626897e+11
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values

Value                 Count   Frequency (%)
0                     983     3.8%
120                   419     1.6%
150                   404     1.6%
100                   376     1.5%
160                   357     1.4%
90                    343     1.3%
200                   341     1.3%
140                   331     1.3%
110                   304     1.2%
80                    298     1.2%
Other values (1654)   19710   76.8%
(Missing)             1786    7.0%

Smallest values

Value   Count   Frequency (%)
0       983     3.8%
1       29      0.1%
2       18      0.1%
3       13      0.1%
4       6       < 0.1%

Largest values

Value      Count   Frequency (%)
73500000   1       < 0.1%
56074000   2       < 0.1%
35000000   1       < 0.1%
29000000   2       < 0.1%
26488000   1       < 0.1%

kitchen_has
Boolean

MISSING

Distinct: 2
Distinct (%): < 0.1%
Missing: 2187
Missing (%): 8.5%
Memory size: 200.4 KiB

Value       Count   Frequency (%)
1           17557   68.4%
0           5908    23.0%
(Missing)   2187    8.5%

furnished
Boolean

MISSING

Distinct: 2
Distinct (%): < 0.1%
Missing: 2832
Missing (%): 11.0%
Memory size: 200.4 KiB

Value       Count   Frequency (%)
0           19918   77.6%
1           2902    11.3%
(Missing)   2832    11.0%

open_fire
Boolean

MISSING

Distinct: 2
Distinct (%): < 0.1%
Missing: 2658
Missing (%): 10.4%
Memory size: 200.4 KiB

Value       Count   Frequency (%)
0           21611   84.2%
1           1383    5.4%
(Missing)   2658    10.4%

terrace
Boolean

MISSING

Distinct: 2
Distinct (%): < 0.1%
Missing: 6534
Missing (%): 25.5%
Memory size: 200.4 KiB

Value       Count   Frequency (%)
1           11058   43.1%
0           8060    31.4%
(Missing)   6534    25.5%

terrace_area
Real number (ℝ≥0)

MISSING
SKEWED
ZEROS

Distinct: 161
Distinct (%): 1.1%
Missing: 11427
Missing (%): 44.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.92674868
Minimum: 0
Maximum: 3749
Zeros: 6711
Zeros (%): 26.2%
Memory size: 200.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 4
Q3: 20
95-th percentile: 60
Maximum: 3749
Range: 3749
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 42.25307551
Coefficient of variation (CV): 2.830695177
Kurtosis: 4302.958711
Mean: 14.92674868
Median Absolute Deviation (MAD): 4
Skewness: 50.34765983
Sum: 212333
Variance: 1785.32239
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values

Value                Count   Frequency (%)
0                    6711    26.2%
20                   450     1.8%
10                   401     1.6%
15                   365     1.4%
8                    330     1.3%
6                    309     1.2%
30                   294     1.1%
12                   286     1.1%
25                   266     1.0%
40                   241     0.9%
Other values (151)   4572    17.8%
(Missing)            11427   44.5%

Smallest values

Value   Count   Frequency (%)
0       6711    26.2%
1       32      0.1%
2       136     0.5%
3       175     0.7%
4       213     0.8%

Largest values

Value   Count   Frequency (%)
3749    1       < 0.1%
708     1       < 0.1%
584     1       < 0.1%
495     1       < 0.1%
450     3       < 0.1%

garden
Real number (ℝ≥0)

MISSING
SKEWED
ZEROS

Distinct: 120
Distinct (%): 0.5%
Missing: 3685
Missing (%): 14.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.98520508
Minimum: 0
Maximum: 3749
Zeros: 15743
Zeros (%): 61.4%
Memory size: 200.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 15
Maximum: 3749
Range: 3749
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 29.48523751
Coefficient of variation (CV): 9.877122917
Kurtosis: 11890.21581
Mean: 2.98520508
Median Absolute Deviation (MAD): 0
Skewness: 95.45319567
Sum: 65576
Variance: 869.3792312
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values

Value                Count   Frequency (%)
0                    15743   61.4%
1                    4654    18.1%
20                   133     0.5%
30                   103     0.4%
15                   97      0.4%
25                   83      0.3%
10                   73      0.3%
40                   73      0.3%
50                   69      0.3%
12                   51      0.2%
Other values (110)   888     3.5%
(Missing)            3685    14.4%

Smallest values

Value   Count   Frequency (%)
0       15743   61.4%
1       4654    18.1%
2       12      < 0.1%
3       11      < 0.1%
4       32      0.1%

Largest values

Value   Count   Frequency (%)
3749    1       < 0.1%
708     1       < 0.1%
450     2       < 0.1%
400     2       < 0.1%
350     2       < 0.1%

garden_area
Real number (ℝ≥0)

MISSING
SKEWED
ZEROS

Distinct: 731
Distinct (%): 4.7%
Missing: 10074
Missing (%): 39.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 223.2714084
Minimum: 0
Maximum: 94000
Zeros: 11761
Zeros (%): 45.8%
Memory size: 200.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 800
Maximum: 94000
Range: 94000
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 2014.178429
Coefficient of variation (CV): 9.021210745
Kurtosis: 876.3142327
Mean: 223.2714084
Median Absolute Deviation (MAD): 0
Skewness: 26.60671719
Sum: 3478122
Variance: 4056914.742
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values

Value                Count   Frequency (%)
0                    11761   45.8%
1                    239     0.9%
100                  127     0.5%
50                   91      0.4%
200                  89      0.3%
300                  87      0.3%
500                  73      0.3%
150                  67      0.3%
30                   64      0.2%
60                   62      0.2%
Other values (721)   2918    11.4%
(Missing)            10074   39.3%

Smallest values

Value   Count   Frequency (%)
0       11761   45.8%
1       239     0.9%
2       1       < 0.1%
3       2       < 0.1%
4       1       < 0.1%

Largest values

Value   Count   Frequency (%)
94000   1       < 0.1%
75000   2       < 0.1%
63000   2       < 0.1%
58000   1       < 0.1%
55000   2       < 0.1%

land_surface
Real number (ℝ≥0)

MISSING
SKEWED
ZEROS

Distinct: 1831
Distinct (%): 9.4%
Missing: 6269
Missing (%): 24.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 690.7379147
Minimum: 0
Maximum: 1379000
Zeros: 11203
Zeros (%): 43.7%
Memory size: 200.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 301
95-th percentile: 2103.6
Maximum: 1379000
Range: 1379000
Interquartile range (IQR): 301

Descriptive statistics

Standard deviation: 10619.48957
Coefficient of variation (CV): 15.37412287
Kurtosis: 14651.97134
Mean: 690.7379147
Median Absolute Deviation (MAD): 0
Skewness: 113.7686367
Sum: 13388573
Variance: 112773558.7
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values

Value                 Count   Frequency (%)
0                     11203   43.7%
100                   104     0.4%
150                   92      0.4%
300                   81      0.3%
200                   75      0.3%
400                   69      0.3%
120                   65      0.3%
250                   62      0.2%
1000                  61      0.2%
50                    52      0.2%
Other values (1821)   7519    29.3%
(Missing)             6269    24.4%

Smallest values

Value   Count   Frequency (%)
0       11203   43.7%
1       42      0.2%
2       2       < 0.1%
3       1       < 0.1%
4       1       < 0.1%

Largest values

Value     Count   Frequency (%)
1379000   1       < 0.1%
150000    2       < 0.1%
117800    2       < 0.1%
110000    1       < 0.1%
103553    1       < 0.1%

land_plot_surface
Real number (ℝ≥0)

MISSING
SKEWED
ZEROS

Distinct: 2867
Distinct (%): 16.6%
Missing: 8371
Missing (%): 32.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 8845478.135
Minimum: 0
Maximum: 1.35e+10
Zeros: 911
Zeros (%): 3.6%
Memory size: 200.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 99
median: 255
Q3: 970
95-th percentile: 2150000
Maximum: 1.35e+10
Range: 1.35e+10
Interquartile range (IQR): 871

Descriptive statistics

Standard deviation: 260663245.9
Coefficient of variation (CV): 29.46853092
Kurtosis: 1631.754895
Mean: 8845478.135
Median Absolute Deviation (MAD): 204
Skewness: 38.43006696
Sum: 1.528587076e+11
Variance: 6.794532779e+16
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values

Value                 Count   Frequency (%)
0                     911     3.6%
100                   169     0.7%
90                    163     0.6%
80                    144     0.6%
70                    141     0.5%
110                   139     0.5%
120                   139     0.5%
150                   131     0.5%
200                   110     0.4%
85                    110     0.4%
Other values (2857)   15124   59.0%
(Missing)             8371    32.6%

Smallest values

Value   Count   Frequency (%)
0       911     3.6%
1       32      0.1%
1.28    1       < 0.1%
1.64    1       < 0.1%
1.77    1       < 0.1%

Largest values

Value        Count   Frequency (%)
1.35e+10     1       < 0.1%
1.3e+10      1       < 0.1%
1.28e+10     1       < 0.1%
1.18e+10     1       < 0.1%
8100000000   1       < 0.1%

facades_number
Real number (ℝ≥0)

MISSING
ZEROS

Distinct: 6
Distinct (%): < 0.1%
Missing: 5580
Missing (%): 21.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.625548027
Minimum: 0
Maximum: 10
Zeros: 8610
Zeros (%): 33.6%
Memory size: 200.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 2
Q3: 3
95-th percentile: 4
Maximum: 10
Range: 10
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.559150754
Coefficient of variation (CV): 0.9591539146
Kurtosis: -1.376505337
Mean: 1.625548027
Median Absolute Deviation (MAD): 2
Skewness: 0.2495757249
Sum: 32628
Variance: 2.430951072
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=6)

Most frequent values

Value       Count   Frequency (%)
0           8610    33.6%
2           5066    19.7%
4           3548    13.8%
3           2719    10.6%
1           127     0.5%
10          2       < 0.1%
(Missing)   5580    21.8%

Smallest values

Value   Count   Frequency (%)
0       8610    33.6%
1       127     0.5%
2       5066    19.7%
3       2719    10.6%
4       3548    13.8%

Largest values

Value   Count   Frequency (%)
10      2       < 0.1%
4       3548    13.8%
3       2719    10.6%
2       5066    19.7%
1       127     0.5%

swimming_pool_has
Boolean

MISSING

Distinct: 2
Distinct (%): < 0.1%
Missing: 2971
Missing (%): 11.6%
Memory size: 200.4 KiB

Value       Count   Frequency (%)
0           21694   84.6%
1           987     3.8%
(Missing)   2971    11.6%

building_state
Categorical

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 200.4 KiB

Value            Count   Frequency (%)
Not specified    13146   51.2%
AS_NEW           5466    21.3%
GOOD             3803    14.8%
TO_BE_DONE_UP    1037    4.0%
TO_RENOVATE      892     3.5%
JUST_RENOVATED   806     3.1%
old              240     0.9%
New              198     0.8%
TO_RESTORE       64      0.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 14
Median length: 13
Mean length: 9.95778107
Min length: 3

region
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 200.4 KiB

Value      Count   Frequency (%)
unknown    11707   45.6%
Flanders   7189    28.0%
Wallonia   4223    16.5%
Brussels   2533    9.9%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 8
Median length: 8
Mean length: 7.54362233
Min length: 7

Interactions

[Figures: 121 pairwise interaction plots of the numeric variables]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
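
The calculation described above can be sketched in a few lines of plain Python. This is an illustration of the formula only, not the code the report generator runs, and `pearson_r` is an illustrative name.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of the covariance and the standard deviations cancel,
    # so plain sums of products and squares are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / (spread_x * spread_y)


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
print(pearson_r(xs, ys))
# Shifting and rescaling ys (by a positive factor) leaves r unchanged.
print(pearson_r(xs, [10 + 3 * y for y in ys]))
```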

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
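
As a rough illustration, ρ can be computed by ranking both variables and then applying the Pearson formula to the ranks. The sketch below assumes ties receive the average of the ranks they span, which is a common convention but an assumption here; the names `average_ranks` and `spearman_rho` are hypothetical.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation computed on the rank variables of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values: y grows monotonically but not linearly with x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
tied_ranks = average_ranks([10, 20, 20, 30])
```

Because only the ordering matters, the cubic relationship still yields a perfect monotonic correlation.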

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
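
The pair-counting definition above corresponds to the variant usually called tau-a. The following minimal sketch implements exactly that definition; library implementations often use a tie-adjusted variant instead, so their results may differ when the data contain ties. The function name `kendall_tau_a` and the sample values are made up for the example.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1      # both variables move in the same direction
        elif direction < 0:
            discordant += 1      # the variables move in opposite directions
        # direction == 0 means a tie; it counts toward neither group
    return (concordant - discordant) / len(pairs)

# Hypothetical values: one swapped pair among four observations.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
```

With four observations there are six pairs; five are concordant and one is discordant, so τ = (5 - 1) / 6.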

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013, which can be found here.
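
A minimal sketch of a bias-corrected Cramér's V follows, assuming the correction takes the form commonly attributed to Bergsma (2013): the chi-squared term and the table dimensions are each adjusted by a term of order 1/(n - 1). It is not necessarily identical to the implementation used to produce this report; the function name `cramers_v_corrected` and the sample labels are hypothetical.

```python
import math
from collections import Counter

def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long lists of category labels."""
    n = len(a)
    joint = Counter(zip(a, b))
    row_totals = Counter(a)
    col_totals = Counter(b)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for row_value, row_total in row_totals.items():
        for col_value, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((row_value, col_value), 0)
            chi2 += (observed - expected) ** 2 / expected

    r, c = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Assumed bias correction: shrink phi^2 and the table dimensions.
    phi2_corr = max(0.0, phi2 - (r - 1) * (c - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    c_corr = c - (c - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, c_corr - 1)
    if denom <= 0:
        return 0.0  # degenerate table, e.g. a variable with a single category
    return math.sqrt(phi2_corr / denom)

# Hypothetical labels, not taken from the dataset.
region = ["Flanders"] * 10 + ["Wallonia"] * 10
v_perfect = cramers_v_corrected(region, list(region))

left = ["Flanders", "Flanders", "Wallonia", "Wallonia"] * 5
right = ["HOUSE", "APARTMENT", "HOUSE", "APARTMENT"] * 5
v_independent = cramers_v_corrected(left, right)
```

The first pair of variables is perfectly associated and the second is independent by construction, which covers both ends of the 0 to 1 range.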

Missing values

[Figures: four missing-values plots]

Sample

First rows

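| | df_index | Unnamed: 0 | locality | house_is | property_subtype | price | sale | rooms_number | area | kitchen_has | furnished | open_fire | terrace | terrace_area | garden | garden_area | land_surface | land_plot_surface | facades_number | swimming_pool_has | building_state | region |
| 0 | 32349 | 32349 | 1380 | True | HOUSE | 1650000.0 | Unknown | 4.0 | 350.0 | 1.0 | 0.0 | 1.0 | 0.0 | NaN | 0.0 | 0.0 | 4040.0 | NaN | 4.0 | 0.0 | AS_NEW | Wallonia |
| 1 | 12426 | 12426 | 1970 | False | APARTMENT | 332956.0 | Unknown | 2.0 | 101.0 | 1.0 | 0.0 | 0.0 | 1.0 | 18.0 | 0.0 | 0.0 | 0.0 | 101.0 | 0.0 | 0.0 | Not specified | Flanders |
| 2 | 12185 | 12185 | 8400 | False | APARTMENT | 159000.0 | Unknown | 2.0 | 90.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 90.0 | 0.0 | 0.0 | TO_BE_DONE_UP | Flanders |
| 3 | 26760 | 26760 | unknown | True | villa | 230000.0 | unknown | 3.0 | 110.0 | 1.0 | 0.0 | 0.0 | 0.0 | NaN | 1.0 | 700.0 | NaN | 824.0 | 4.0 | 0.0 | Not specified | unknown |
| 4 | 16985 | 16985 | 9040 | False | APARTMENT | 195000.0 | Unknown | 2.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | Not specified | Flanders |
| 5 | 29483 | 29483 | unknown | False | apartment | 275000.0 | unknown | 1.0 | 99.0 | 1.0 | 0.0 | 0.0 | 0.0 | NaN | 0.0 | NaN | NaN | NaN | 2.0 | 0.0 | Not specified | unknown |
| 6 | 5337 | 5337 | 6637 | True | HOUSE | 210000.0 | Unknown | 4.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1600.0 | 1600.0 | 0.0 | 0.0 | Not specified | Wallonia |
| 7 | 50777 | 50777 | unknown | False | APARTMENT | 895000.0 | residential_sale | 3.0 | 227.0 | 1.0 | NaN | 0.0 | NaN | NaN | 50.0 | NaN | 0.0 | NaN | 3.0 | NaN | AS_NEW | unknown |
| 8 | 8162 | 8162 | 4530 | True | HOUSE | 265000.0 | Unknown | 3.0 | 185.0 | 1.0 | 0.0 | 0.0 | 1.0 | 30.0 | 1.0 | 170.0 | 0.0 | 185.0 | 0.0 | 0.0 | Not specified | Wallonia |
| 9 | 31178 | 31178 | 1140 | True | HOUSE | 850000.0 | Unknown | 5.0 | 305.0 | 1.0 | 0.0 | 0.0 | 0.0 | NaN | 0.0 | 0.0 | 500.0 | NaN | 2.0 | 0.0 | GOOD | Brussels |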

Last rows

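| | df_index | Unnamed: 0 | locality | house_is | property_subtype | price | sale | rooms_number | area | kitchen_has | furnished | open_fire | terrace | terrace_area | garden | garden_area | land_surface | land_plot_surface | facades_number | swimming_pool_has | building_state | region |
| 25642 | 30066 | 30066 | unknown | True | house | 248000.0 | unknown | 3.0 | 216.00 | 1.0 | 0.0 | 0.0 | 0.0 | NaN | 0.0 | NaN | NaN | 780.0 | 3.0 | 0.0 | Not specified | unknown |
| 25643 | 47148 | 47148 | unknown | False | APARTMENT_BLOCK | 199000.0 | residential_sale | 0.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.0 | 220.0 | 2.0 | NaN | Not specified | unknown |
| 25644 | 30795 | 30795 | unknown | True | house | 530000.0 | unknown | 3.0 | 200.00 | 1.0 | 0.0 | 0.0 | 1.0 | 25.0 | 1.0 | NaN | NaN | 452.0 | 3.0 | 0.0 | Not specified | unknown |
| 25645 | 22842 | 22842 | unknown | False | ground-floor | 119999.0 | unknown | 1.0 | 50.00 | 1.0 | 0.0 | 0.0 | 0.0 | NaN | 0.0 | NaN | NaN | NaN | NaN | 0.0 | Not specified | unknown |
| 25646 | 31588 | 31588 | 2400 | True | HOUSE | 325000.0 | Unknown | 3.0 | 180.00 | 1.0 | 0.0 | 0.0 | 1.0 | 28.0 | 1.0 | 77.0 | 200.0 | NaN | 2.0 | 0.0 | AS_NEW | Flanders |
| 25647 | 18108 | 18108 | unknown | False | Apartment | 6.0 | Apartment | 8.0 | 6350.71 | 0.0 | 0.0 | 0.0 | NaN | NaN | 0.0 | 0.0 | 0.0 | 6800000.0 | NaN | 0.0 | Not specified | unknown |
| 25648 | 19270 | 19270 | unknown | False | Apartment | 2.0 | Apartment | 6.0 | 1883.68 | 0.0 | 0.0 | 0.0 | NaN | NaN | 0.0 | 0.0 | NaN | 2450000.0 | NaN | 0.0 | Not specified | unknown |
| 25649 | 51075 | 51075 | unknown | False | PENTHOUSE | 1396000.0 | residential_sale | 3.0 | 202.00 | 1.0 | NaN | 0.0 | NaN | NaN | 157.0 | NaN | 0.0 | NaN | NaN | 0.0 | AS_NEW | unknown |
| 25650 | 10129 | 10129 | 8400 | False | APARTMENT | 129000.0 | Unknown | 2.0 | 66.00 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 66.0 | 0.0 | 0.0 | GOOD | Flanders |
| 25651 | 11940 | 11940 | 1040 | False | APARTMENT | 475000.0 | Unknown | 2.0 | 175.00 | 1.0 | 0.0 | 0.0 | 1.0 | 25.0 | 0.0 | 0.0 | 0.0 | 7.0 | 0.0 | 0.0 | AS_NEW | Brussels |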